Forth (programming language)

Forth is a procedural, stack-oriented programming language and interactive environment designed by Charles H. "Chuck" Moore and first used by other programmers in 1970. Although not an acronym, the language's name in its early years was often spelled in all capital letters as ''FORTH''. The FORTH-79 and FORTH-83 implementations, which were not written by Moore, became de facto standards, and an official standardization of the language was published in 1994 as ANS Forth. A wide range of Forth derivatives existed before and after ANS Forth.

Forth typically combines a compiler with an integrated command shell, where the user interacts via subroutines called ''words''. Words can be defined, tested, redefined, and debugged without recompiling or restarting the whole program. All syntactic elements, including variables and basic operators, are defined as words. A stack is used to pass parameters between words, leading to a Reverse Polish Notation style.

For much of Forth's existence, the standard technique was to compile to threaded code, which can be interpreted faster than bytecode. One of the early benefits of Forth was size: an entire development environment (including compiler, editor, and user programs) could fit in memory on an 8-bit or similarly limited system. Modern implementations, no longer constrained by space, generate optimized machine code like other language compilers.

Forth is used in the Open Firmware boot loader, in space applications ("NASA applications of Forth"; original NASA server no longer running, copy from archive.org) such as the Philae spacecraft, and in other embedded systems which involve interaction with hardware. The relative simplicity of creating a basic Forth system has led to many personal and proprietary variants, such as the custom Forth used to implement the bestselling 1986 video game ''Starflight'' from Electronic Arts. The free software Gforth implementation is actively maintained, as are several commercially supported systems.

Moore later developed a series of microprocessors for executing compiled Forth-like code directly and experimented with smaller languages based on Forth concepts, including cmForth and colorForth.


Uses

Forth has a niche in astronomical and space applications as well as a history in embedded systems. The Open Firmware boot ROMs used by Apple, IBM, Sun, and OLPC XO-1 contain a Forth environment.

Forth has often been used to bring up new hardware. Forth was the first resident software on the new Intel 8086 chip in 1978, and MacFORTH was the first resident development system for the Macintosh 128K in 1984.

Atari, Inc. used an elaborate animated demo written in Forth to showcase capabilities of the Atari 400 and 800 computers in department stores. Three home computer games from Electronic Arts, published in the 1980s, were written in Forth: ''Worms?'' (1983), ''Starflight'' (1986), and ''Lords of Conquest'' (1986). The Canon Cat (1987) uses Forth for its system programming.

Rockwell produced single-chip microcomputers with resident Forth kernels: the R65F11 and R65F12. ASYST was a Forth expansion for measuring and controlling on PCs (Campbell et al., "Up and Running with Asyst 2.0", MacMillan Software Co., 1987).


History

Forth evolved from Charles H. Moore's personal programming system, which had been in continuous development since 1968. Forth was first exposed to other programmers in the early 1970s, starting with Elizabeth Rather at the United States National Radio Astronomy Observatory (NRAO). After their work at NRAO, Charles Moore and Elizabeth Rather formed FORTH, Inc. in 1973, refining and porting Forth systems to dozens of other platforms in the next decade.

Forth is so named because in 1968 "the file holding the interpreter was labeled FOURTH, for 4th (next) generation software, but the IBM 1130 operating system restricted file names to five characters." Moore saw Forth as a successor to compile-link-go third-generation programming languages, or software for "fourth generation" hardware.

FORTH, Inc.'s microFORTH was developed for the Intel 8080, Motorola 6800, Zilog Z80, and RCA 1802 microprocessors, starting in 1976. MicroFORTH was later used by hobbyists to generate Forth systems for other architectures, such as the 6502 in 1978. The Forth Interest Group was formed in 1978. It promoted and distributed its own version of the language, FIG-Forth, for most makes of home computer.

Forth was popular in the early 1980s because it was well suited to the limited memory of microcomputers. The ease of implementing the language led to many implementations. The British Jupiter ACE home computer has Forth in its ROM-resident operating system. Insoft GraFORTH is a version of Forth with graphics extensions for the Apple II.

Common practice was codified in the de facto standards FORTH-79 and FORTH-83 in the years 1979 and 1983, respectively. These standards were unified by ANSI in 1994 into the standard commonly referred to as ANS Forth.

As of 2018, the source for the original 1130 version of FORTH has been recovered, and is now being updated to run on a restored or emulated 1130 system.


Overview

Forth emphasizes the use of small, simple functions called ''words''. Words for bigger tasks call upon many smaller words that each accomplish a distinct sub-task. A large Forth program is a hierarchy of words. These words, being distinct modules that communicate implicitly via a stack mechanism, can be prototyped, built and tested independently. The highest level of Forth code may resemble an English-language description of the application. Forth has been called a ''meta-application language'': a language that can be used to create problem-oriented languages.

Forth relies on explicit use of a data stack and reverse Polish notation (RPN), which is commonly used in calculators from Hewlett-Packard. In RPN, the operator is placed after its operands, as opposed to the more common infix notation where the operator is placed between its operands. Postfix notation makes the language easier to parse and extend; Forth's flexibility makes a static BNF grammar inappropriate, and it does not have a monolithic compiler. Extending the compiler only requires writing a new word, instead of modifying a grammar and changing the underlying implementation.

Using RPN, one can get the result of the mathematical expression (25 * 10 + 50) this way:

    25 10 * 50 + CR .
    300 ok

First the numbers 25 and 10 are put on the stack. The word * takes the top two numbers from the stack, multiplies them, and puts the product back on the stack. Then the number 50 is placed on the stack. The word + adds the top two values, pushing the sum. CR (carriage return) starts the output on a new line. Finally, . prints the result. As everything has completed successfully, the Forth system prints ok.

Even Forth's structural features are stack-based. For example:

    : FLOOR5 ( n -- n' )   DUP 6 < IF DROP 5 ELSE 1 - THEN ;

The colon indicates the beginning of a new definition, in this case a new word (again, ''word'' is the term used for a subroutine) called FLOOR5. The text in parentheses is a comment, advising that this word expects a number on the stack and will return a possibly changed number (on the stack). The subroutine uses the following commands: DUP duplicates the number on the stack; 6 pushes a 6 on top of the stack; < compares the top two numbers on the stack (6 and the DUPed input), and replaces them with a true-or-false value; IF takes a true-or-false value and chooses to execute commands immediately after it or to skip to the ELSE; DROP discards the value on the stack; 5 pushes a 5 on top of the stack; ELSE begins the alternative branch, in which 1 - subtracts one from the input; and THEN ends the conditional.

The FLOOR5 word is equivalent to this function written in the C programming language using the ternary operator '?:':

    int floor5(int v) {
      return (v < 6) ? 5 : (v - 1);
    }

This function is written more succinctly as:

    : FLOOR5 ( n -- n' ) 1- 5 MAX ;

This can be run as follows:

    1 FLOOR5 CR .
    5 ok
    8 FLOOR5 CR .
    7 ok

First a number (1 or 8) is pushed onto the stack, then FLOOR5 is called, which pops the number again and pushes the result. CR moves the output to a new line (again, this is only here for readability). Finally, a call to . pops the result and prints it.
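The stack discipline described above can be modelled outside Forth. The following is a minimal Python sketch, not a Forth implementation: `eval_rpn`, `floor5_if` and `floor5_max` are hypothetical names introduced here, and only the handful of words used in this section are supported.

```python
def eval_rpn(source):
    """Evaluate a whitespace-separated postfix expression on an explicit stack."""
    binary_ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "MAX": max,
    }
    stack = []
    for token in source.split():
        if token in binary_ops:
            b = stack.pop()          # top of stack is the second operand
            a = stack.pop()
            stack.append(binary_ops[token](a, b))
        elif token == "1-":
            stack.append(stack.pop() - 1)
        else:
            stack.append(int(token))  # anything else is treated as a number
    return stack


def floor5_if(n):
    # Mirrors the conditional form: DUP 6 < IF DROP 5 ELSE 1 - THEN
    return 5 if n < 6 else n - 1


def floor5_max(n):
    # Mirrors the succinct form: 1- 5 MAX
    return eval_rpn(f"{n} 1- 5 MAX")[-1]
```

Calling `eval_rpn("25 10 * 50 +")` leaves a single value on the modelled stack, and the two FLOOR5 variants agree because subtracting one and clamping to a minimum of 5 is the same as returning 5 for every input below 6.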


Facilities

Forth's grammar has no official specification. Instead, it is defined by a simple algorithm. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter; some systems recognise additional whitespace characters. When the interpreter finds a word, it looks the word up in the ''dictionary''. If the word is found, the interpreter executes the code associated with the word, and then returns to parse the rest of the input stream. If the word isn't found, the word is assumed to be a number and an attempt is made to convert it into a number and push it on the stack; if successful, the interpreter continues parsing the input stream. Otherwise, if both the lookup and the number conversion fail, the interpreter prints the word followed by an error message indicating that the word is not recognised, flushes the input stream, and waits for new user input.

The definition of a new word is started with the word : (colon) and ends with the word ; (semi-colon). For example,

    : X DUP 1+ . . ;

compiles the word X and makes the name findable in the dictionary. When executed by typing 10 X at the console this will print 11 10.

Most Forth systems include an assembler to write words using the processor's facilities. Forth assemblers often use a reverse Polish syntax in which the parameters of an instruction precede the instruction. A typical reverse Polish assembler prepares the operands on the stack and the mnemonic copies the whole instruction into memory as the last step. A Forth assembler is by nature a macro assembler, so that it is easy to define an alias for registers according to their role in the Forth system: e.g. "dsp" for the register used as the data stack pointer.
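The interpreter algorithm and colon definitions described in this section can be sketched in a few lines of Python. This is an illustrative model under simplifying assumptions: the class `MiniForth`, the exception `UnknownWord` and the `out` list are hypothetical names, only three primitive words are provided, lookup is case-sensitive, and printed values are collected in a list rather than written to a console.

```python
class UnknownWord(Exception):
    """Raised when a token is neither a dictionary word nor a number."""


class MiniForth:
    def __init__(self):
        self.stack = []
        self.out = []  # printed values are collected here instead of a console
        self.dictionary = {
            "DUP": lambda: self.stack.append(self.stack[-1]),
            "1+": lambda: self.stack.append(self.stack.pop() + 1),
            ".": lambda: self.out.append(str(self.stack.pop())),
        }

    def resolve(self, token):
        # Step 1: dictionary lookup. Step 2: number conversion. Step 3: error.
        if token in self.dictionary:
            return self.dictionary[token]
        try:
            value = int(token)
        except ValueError:
            raise UnknownWord(f"{token} ?") from None
        return lambda: self.stack.append(value)

    def define(self, name, body):
        def run():
            for step in body:
                step()
        self.dictionary[name] = run  # the new name is now findable

    def interpret(self, line):
        tokens = iter(line.split())
        for token in tokens:
            if token == ":":
                name = next(tokens)
                body = []
                for inner in tokens:  # compile until the closing ;
                    if inner == ";":
                        break
                    body.append(self.resolve(inner))
                self.define(name, body)
            else:
                self.resolve(token)()
```

Raising the exception abandons the rest of the line, which loosely corresponds to flushing the input stream. Feeding the model `: X DUP 1+ . . ;` followed by `10 X` leaves `["11", "10"]` in `out`, matching the behaviour described for X.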


Operating system, files, and multitasking

Most Forth systems run under a host operating system such as Microsoft Windows, Linux or a version of Unix and use the host operating system's file system for source and data files; the ANSI Forth Standard describes the words used for I/O. All modern Forth systems use normal text files for source, even if they are embedded. An embedded system with a resident compiler gets its source via a serial line.

Classic Forth systems traditionally use neither operating system nor file system. Instead of storing code in files, source code is stored in disk blocks written to physical disk addresses. The word BLOCK is employed to translate the number of a 1K-sized block of disk space into the address of a buffer containing the data, which is managed automatically by the Forth system. Block use has become rare since the mid-1990s. In a hosted system, those blocks are in any case allocated in a normal file.

Multitasking, most commonly cooperative round-robin scheduling, is normally available (although multitasking words and support are not covered by the ANSI Forth Standard). The word PAUSE is used to save the current task's execution context, to locate the next task, and restore its execution context. Each task has its own stacks, private copies of some control variables and a scratch area. Swapping tasks is simple and efficient; as a result, Forth multitaskers are available even on very simple microcontrollers, such as the Intel 8051, Atmel AVR, and TI MSP430.

Other non-standard facilities include a mechanism for issuing calls to the host OS or windowing systems, and many provide extensions that employ the scheduling provided by the operating system. Typically they have a larger and different set of words from the stand-alone Forth's PAUSE word for task creation, suspension, destruction and modification of priority.
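The cooperative round-robin scheme built around PAUSE can be approximated with Python generators, where each `yield` plays the role of PAUSE. This is only an analogy under stated assumptions: `task` and `run_round_robin` are hypothetical names, and a real Forth multitasker saves and restores stack pointers rather than relying on a language runtime.

```python
from collections import deque


def task(name, steps, log):
    """A task with private state; each `yield` plays the role of PAUSE."""
    private_stack = []            # each task owns its own stack
    for i in range(steps):
        private_stack.append(i)
        log.append(f"{name}:{i}")
        yield                     # PAUSE: hand control back to the scheduler


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)         # run the task up to its next PAUSE
        except StopIteration:
            continue              # finished tasks leave the ring
        ready.append(current)     # otherwise rejoin at the back of the ring
```

Because a task gives up the processor only at its own `yield`, no locking is needed around the shared `log`, which is the property that keeps cooperative task switching cheap.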


Self-compilation and cross compilation

A full-featured Forth system with all source code will compile itself, a technique commonly called meta-compilation or self-hosting by Forth programmers (although the term doesn't exactly match meta-compilation as it is normally defined). The usual method is to redefine the handful of words that place compiled bits into memory. The compiler's words use specially named versions of fetch and store that can be redirected to a buffer area in memory. The buffer area simulates or accesses a memory area beginning at a different address than the code buffer. Such compilers define words to access both the target computer's memory, and the host (compiling) computer's memory. After the fetch and store operations are redefined for the code space, the compiler, assembler, etc. are recompiled using the new definitions of fetch and store. This effectively reuses all the code of the compiler and interpreter. Then, the Forth system's code is compiled, but this version is stored in the buffer. The buffer in memory is written to disk, and ways are provided to load it temporarily into memory for testing. When the new version appears to work, it is written over the previous version.

Numerous variations of such compilers exist for different environments. For embedded systems, the code may instead be written to another computer, a technique known as cross compilation, over a serial port or even a single TTL bit, while keeping the word names and other non-executing parts of the dictionary in the original compiling computer. The minimum definitions for such a Forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed. Often the most time-consuming part of writing a remote port is constructing the initial program to implement fetch, store and execute, but many modern microprocessors have integrated debugging features (such as the Motorola CPU32) that eliminate this task.
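The redirection of fetch and store to a buffer that stands in for target memory might be sketched as follows. This is a rough Python illustration under assumptions of this sketch only: `TargetImage`, `comma` and `compile_bytes` are hypothetical names, the target is byte-addressed, and the origin address `0x8000` used in the usage note is arbitrary.

```python
class TargetImage:
    """A host-side buffer standing in for target memory that starts at `origin`."""

    def __init__(self, origin, size):
        self.origin = origin
        self.buffer = bytearray(size)
        self.here = origin                 # next free target address

    def _offset(self, addr):
        offset = addr - self.origin        # translate target address to buffer index
        if not 0 <= offset < len(self.buffer):
            raise IndexError(f"target address {addr:#x} outside image")
        return offset

    def store(self, addr, byte):           # the redirected "store"
        self.buffer[self._offset(addr)] = byte & 0xFF

    def fetch(self, addr):                 # the redirected "fetch"
        return self.buffer[self._offset(addr)]

    def comma(self, byte):                 # compile one byte at the target's HERE
        self.store(self.here, byte)
        self.here += 1


def compile_bytes(image, data):
    """Lay down `data` in the image and return the target address it starts at."""
    start = image.here
    for byte in data:
        image.comma(byte)
    return start
```

With `image = TargetImage(0x8000, 16)`, code compiled through `comma` is addressed as if it lived at `0x8000` on the target, while the bytes themselves sit at offset 0 of a host buffer that can later be written to disk or sent over a serial link.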


Structure of the language

The basic data structure of Forth is the "dictionary" which maps "words" to executable code or named data structures. The dictionary is laid out in memory as a tree of linked lists with the links proceeding from the latest (most recently) defined word to the oldest, until a sentinel value, usually a NULL pointer, is found. A context switch causes a list search to start at a different leaf. A linked list search continues as the branch merges into the main trunk leading eventually back to the sentinel, the root. There can be several dictionaries. In rare cases such as meta-compilation a dictionary might be isolated and stand-alone. The effect resembles that of nesting namespaces and can overload keywords depending on the context.

A defined word generally consists of ''head'' and ''body'' with the head consisting of the ''name field'' (NF) and the ''link field'' (LF), and body consisting of the ''code field'' (CF) and the ''parameter field'' (PF). Head and body of a dictionary entry are treated separately because they may not be contiguous. For example, when a Forth program is recompiled for a new platform, the head may remain on the compiling computer, while the body goes to the new platform. In some environments (such as embedded systems) the heads occupy memory unnecessarily. However, some cross-compilers may put heads in the target if the target itself is expected to support an interactive Forth.
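The tree of linked lists can be pictured with a small Python model. It is a sketch under assumptions: `Entry`, `define` and `find` are hypothetical names, `None` stands in for the NULL sentinel, and the `action` string stands in for the code and parameter fields.

```python
class Entry:
    def __init__(self, name, link, action):
        self.name = name      # name field
        self.link = link      # link field: the previously defined entry, or None
        self.action = action  # stands in for the code field and parameter field


def define(latest, name, action):
    """Add a word in front of `latest` and return the new head of that list."""
    return Entry(name, latest, action)


def find(latest, name):
    """Search from the most recent definition back toward the sentinel."""
    entry = latest
    while entry is not None:  # None plays the role of the NULL sentinel
        if entry.name == name:
            return entry
        entry = entry.link
    return None
```

Two branches that share a trunk behave like nested namespaces. If `trunk` holds DUP and +, then `editor = define(trunk, "DUP", ...)` shadows DUP only for searches that start at the `editor` leaf, while a search starting at another leaf still reaches the trunk's DUP.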


Dictionary entry

The exact format of a dictionary entry is not prescribed, and implementations vary. However, certain components are almost always present, though the exact size and order may vary. Described as a structure, a dictionary entry might look this way:

    structure
      byte:       flag           \ length of word's name
      char-array: name           \ name's runtime length isn't known at compile time
      address:    previous       \ link field, backward ptr to previous word
      address:    codeword       \ ptr to the code to execute this word
      any-array:  parameterfield \ unknown length of data, words, or opcodes
    end-structure forthword

The name field starts with a prefix giving the length of the word's name. The character representation of the word's name then follows the prefix. Depending on the particular implementation of Forth, there may be one or more NUL ('\0') bytes for alignment. The link field contains a pointer to the previously defined word. The pointer may be a relative displacement or an absolute address that points to the next oldest sibling. The code field pointer will be either the address of the word which will execute the code or data in the parameter field or the beginning of machine code that the processor will execute directly. For colon-defined words, the code field pointer points to the word that will save the current Forth instruction pointer (IP) on the return stack, and load the IP with the new address from which to continue execution of words. This is the same as what a processor's call/return instructions do.
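One possible byte layout for such an entry can be sketched with Python's `struct` module. Every concrete choice here is an assumption made for illustration, since the format is implementation-defined: 16-bit little-endian cells, a length stored in the low 5 bits of the first byte with flag bits above it, and NUL padding to a cell boundary. The names `pack_entry` and `unpack_name` are hypothetical.

```python
import struct

LENGTH_MASK = 0x1F  # assumed: low 5 bits hold the name length, upper bits are flags


def pack_entry(name, link, code_addr, params, flags=0):
    """Serialize one dictionary entry: name field, link field, code field, parameters."""
    name_bytes = name.encode("ascii")
    if len(name_bytes) > LENGTH_MASK:
        raise ValueError("name too long for the assumed 5-bit length")
    head = bytes([flags | len(name_bytes)]) + name_bytes
    head += b"\0" * (-len(head) % 2)       # NUL padding to a 16-bit boundary
    cells = [link, code_addr, *params]     # link field, code field, parameter field
    return head + struct.pack(f"<{len(cells)}H", *cells)


def unpack_name(blob):
    """Recover the word's name from the length-prefixed name field."""
    length = blob[0] & LENGTH_MASK
    return blob[1:1 + length].decode("ascii")
```

For example, `pack_entry("DUP", 0x1000, 0x2000, [7])` produces a length byte of 3, the three name characters, and then three 16-bit cells; a name of even length such as IF gains one padding byte so that the cells stay aligned.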


Structure of the compiler

The compiler itself is not a monolithic program. It consists of Forth words visible to the system, and usable by a programmer. This allows a programmer to change the compiler's words for special purposes. The "compile time" flag in the name field is set for words with "compile time" behavior. Most simple words execute the same code whether they are typed on a command line, or embedded in code. When compiling these, the compiler simply places code or a threaded pointer to the word.

The classic examples of compile-time words are the control structures such as IF and WHILE. Almost all of Forth's control structures and almost all of its compiler are implemented as compile-time words. Apart from some rarely used control flow words only found in a few implementations, such as the conditional return word ?EXIT used in Ulrich Hoffmann's preForth (see Ulrich Hoffmann's preForth slides), all of Forth's control flow words are executed during compilation to compile various combinations of primitive words along with their branch addresses. For instance, IF and WHILE, and the words that match with those, set up BRANCH (unconditional branch) and ?BRANCH (pop a value off the stack, and branch if it is false). Counted loop control flow words work similarly but set up combinations of primitive words that work with a counter, and so on. During compilation, the data stack is used to support control structure balancing, nesting, and back-patching of branch addresses. The snippet:

  ... DUP 6 < IF DROP 5 ELSE 1 - THEN ...

would be compiled to the following sequence inside a definition:

  ... DUP LIT 6 < ?BRANCH 5  DROP LIT 5  BRANCH 3  LIT 1 - ...

The numbers after BRANCH represent relative jump addresses. LIT is the primitive word for pushing a "literal" number onto the data stack.
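Because control flow words are ordinary compile-time words, a programmer can build new ones from existing ones. The sketch below is one possible way to write a conditional return word in standard Forth; it is not necessarily how preForth defines ?EXIT, and a system that already provides a word of that name may print a redefinition notice. ADJUST and BUMP are hypothetical names used only for illustration.

```forth
\ Hypothetical conditional-return word built from IF, EXIT and THEN.
\ POSTPONE defers the compile-time action of each word until ?EXIT
\ itself runs, which happens while another definition is compiled.
: ?EXIT ( run-time: flag -- )
  POSTPONE IF  POSTPONE EXIT  POSTPONE THEN ; IMMEDIATE

\ The snippet from the text, wrapped in a definition.
: ADJUST ( n -- m )  DUP 6 < IF DROP 5 ELSE 1 - THEN ;

\ Uses ?EXIT: leave n unchanged when it is negative, else add 1.
: BUMP ( n -- m )  DUP 0< ?EXIT  1+ ;

3 ADJUST .    \ 5
10 ADJUST .   \ 9
-4 BUMP .     \ -4
4 BUMP .      \ 5
```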


Compilation state and interpretation state

The word : (colon) parses a name as a parameter, creates a dictionary entry (a ''colon definition'') and enters compilation state. The interpreter continues to read space-delimited words from the user input device. If a word is found, the interpreter executes the ''compilation semantics'' associated with the word, instead of the ''interpretation semantics''. The default compilation semantics of a word are to append its interpretation semantics to the current definition.

The word ; (semi-colon) finishes the current definition and returns to interpretation state. It is an example of a word whose compilation semantics differ from the default. The interpretation semantics of ; (semi-colon), most control flow words, and several other words are undefined in Forth, meaning that they must only be used inside of definitions and not on the interactive command line.

The interpreter state can be changed manually with the words [ (left-bracket) and ] (right-bracket) which enter interpretation state or compilation state, respectively. These words can be used with the word LITERAL to calculate a value during a compilation and to insert the calculated value into the current colon definition. LITERAL has the compilation semantics to take an object from the data stack and to append semantics to the current colon definition to place that object on the data stack.

In Forth, the current state of the interpreter can be read from the flag STATE which contains the value true when in compilation state and false otherwise. This allows the implementation of so-called ''state-smart words'' with behavior that changes according to the current state of the interpreter.
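The interplay of [ , ] , LITERAL and STATE can be illustrated with a short sketch. It assumes a standard Forth system with cells of at least 32 bits; SECONDS/DAY, .STATE and PROBE are hypothetical names introduced here.

```forth
\ Compute a value while compiling and embed it as a literal,
\ so the multiplication is not repeated each time the word runs.
: SECONDS/DAY ( -- n )  [ 60 60 * 24 * ] LITERAL ;

SECONDS/DAY .        \ 86400

\ An immediate word that reports STATE at the moment it executes.
: .STATE ( -- )
  STATE @ IF ." compiling " ELSE ." interpreting " THEN ; IMMEDIATE

.STATE               \ at the console: interpreting
: PROBE  .STATE ;    \ prints "compiling" while PROBE is being defined
```

.STATE is a deliberately trivial state-smart word; it only shows that STATE can be inspected, not a recommended design.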


Immediate words

The word IMMEDIATE marks the most recent colon definition as an ''immediate word'', effectively replacing its compilation semantics with its interpretation semantics. Immediate words are normally executed during compilation, not compiled, but this can be overridden by the programmer in either state. ; is an example of an immediate word. In Forth, the word POSTPONE takes a name as a parameter and appends the compilation semantics of the named word to the current definition even if the word was marked immediate. Forth-83 defined separate words COMPILE and [COMPILE] to force the compilation of non-immediate and immediate words, respectively.
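A small sketch of IMMEDIATE and POSTPONE working together, assuming a standard Forth system: ENDIF is defined as a synonym for THEN. The name ENDIF is used for illustration only (some systems already supply it and may print a redefinition notice), and SIGN is a hypothetical test word.

```forth
\ POSTPONE THEN appends THEN's compilation semantics to ENDIF.
\ IMMEDIATE makes ENDIF execute while another word is being compiled,
\ so it resolves the pending IF exactly as THEN would.
: ENDIF ( -- )  POSTPONE THEN ; IMMEDIATE

: SIGN ( n -- -1|0|1 )
  DUP 0< IF DROP -1 ELSE 0 > IF 1 ELSE 0 ENDIF ENDIF ;

-7 SIGN .   \ -1
0 SIGN .    \ 0
12 SIGN .   \ 1
```

Without IMMEDIATE, ENDIF would be compiled into SIGN as an ordinary call and the IF would never be resolved.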


Unnamed words and execution tokens

In Forth, unnamed words can be defined with the word :NONAME which compiles the following words up to the next ; (semi-colon) and leaves an ''execution token'' on the data stack. The execution token provides an opaque handle for the compiled semantics, similar to the function pointers of the C programming language. Execution tokens can be stored in variables. The word EXECUTE takes an execution token from the data stack and performs the associated semantics. The word COMPILE, (compile-comma) takes an execution token from the data stack and appends the associated semantics to the current definition. The word ' (tick) takes the name of a word as a parameter and returns the execution token associated with that word on the data stack. In interpretation state, ' RANDOM-WORD EXECUTE is equivalent to RANDOM-WORD.
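The following sketch shows execution tokens being created, stored and executed. It assumes a standard Forth system with the Core Extension word :NONAME; SQUARE-XT, ACTION and APPLY are hypothetical names.

```forth
\ An unnamed word that squares its argument; :NONAME leaves its xt,
\ which is then given a name as a constant.
:NONAME ( n -- n*n )  DUP * ;  CONSTANT SQUARE-XT

7 SQUARE-XT EXECUTE .            \ 49

\ Store an execution token in a variable to choose behavior at run time.
VARIABLE ACTION
: APPLY ( n -- m )  ACTION @ EXECUTE ;

SQUARE-XT ACTION !   5 APPLY .   \ 25
' NEGATE  ACTION !   5 APPLY .   \ -5
```

This is the same pattern a C program would express with a function pointer variable.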


Parsing words and comments

The words : (colon), POSTPONE, and ' (tick) are examples of ''parsing words'' that take their arguments from the user input device instead of the data stack. Another example is the word ( (paren) which reads and ignores the following words up to and including the next right parenthesis and is used to place comments in a colon definition. Similarly, the word \ (backslash) is used for comments that continue to the end of the current line. To be parsed correctly, ( (paren) and \ (backslash) must be separated by whitespace from the following comment text.
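Programmers can write parsing words of their own. The sketch below, which assumes a standard Forth system, defines a hypothetical word NAME-LENGTH that takes its argument from the input stream with BL WORD rather than from the data stack, and also shows both comment forms in use. TWICE is likewise a hypothetical name.

```forth
\ Read the next space-delimited name from the input and print its length.
\ BL WORD returns a counted string; COUNT yields its address and length.
: NAME-LENGTH ( "name" -- )  BL WORD COUNT SWAP DROP . ;

NAME-LENGTH FORTH     \ 5

( a paren comment: ignored up to the closing parenthesis )
\ a backslash comment: ignored to the end of the line
: TWICE ( n -- 2n )  2 * ;   \ stack effects are written as paren comments
4 TWICE .             \ 8
```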


Structure of code

In most Forth systems, the body of a code definition consists of either machine language or some form of threaded code. The original Forth, which follows the informal FIG (Forth Interest Group) standard, is a TIL (Threaded Interpretive Language). This is also called indirect-threaded code, but direct-threaded and subroutine-threaded Forths have also become popular in modern times. The fastest modern Forths, such as SwiftForth, VFX Forth, and iForth, compile Forth to native machine code.


Data objects

When a word is a variable or other data object, the CF points to the runtime code associated with the defining word that created it. A defining word has a characteristic "defining behavior" (creating a dictionary entry plus possibly allocating and initializing data space) and also specifies the behavior of an instance of the class of words constructed by this defining word. Examples include:

* VARIABLE: Names an uninitialized, one-cell memory location. Instance behavior of a VARIABLE returns its address on the stack.
* CONSTANT: Names a value (specified as an argument to CONSTANT). Instance behavior returns the value.
* CREATE: Names a location; space may be allocated at this location, or it can be set to contain a string or other initialized value. Instance behavior returns the address of the beginning of this space.

Forth also provides a facility by which a programmer can define new application-specific defining words, specifying both a custom defining behavior and instance behavior. Some examples include circular buffers, named bits on an I/O port, and automatically indexed arrays.

Data objects defined by these and similar words are global in scope. The function provided by local variables in other languages is provided by the data stack in Forth (although Forth also has real local variables). Forth programming style uses very few named data objects compared with other languages; typically such data objects are used to contain data which is used by a number of words or tasks (in a multitasked implementation).

Forth does not enforce consistency of data type usage; it is the programmer's responsibility to use appropriate operators to fetch and store values or perform other operations on data.
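An application-specific defining word is commonly written with CREATE and DOES>. The sketch below, assuming a standard Forth system, defines a hypothetical defining word ARRAY for automatically indexed arrays; SCORES, COUNTER and LIMIT are hypothetical instances. It performs no bounds checking.

```forth
\ Defining behavior: create a named entry and reserve n cells.
\ Instance behavior (after DOES>): turn an index into an element address.
: ARRAY ( n "name" -- )
  CREATE CELLS ALLOT
  DOES> ( i -- addr )  SWAP CELLS + ;

10 ARRAY SCORES        \ ten uninitialized cells named SCORES
42 3 SCORES !          \ store 42 in element 3
3 SCORES @ .           \ 42

\ The built-in defining words behave as described in the list above.
VARIABLE COUNTER   0 COUNTER !    \ a VARIABLE pushes its address
1 COUNTER +!   COUNTER @ .        \ 1
100 CONSTANT LIMIT   LIMIT .      \ a CONSTANT pushes its value: 100
```

Because the element address is computed by the instance behavior, the same @ and ! operators used for a VARIABLE work on array elements.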


Examples


“Hello, World!”

  : HELLO  ( -- )  CR ." Hello, World!" ;

  HELLO <cr>
  Hello, World!

The word CR (Carriage Return) causes the following output to be displayed on a new line. The parsing word ." (dot-quote) reads a double-quote delimited string and appends code to the current definition so that the parsed string will be displayed on execution. The space character separating the word ." from the string Hello, World! is not included as part of the string. It is needed so that the parser recognizes ." as a Forth word.

A standard Forth system is also an interpreter, and the same output can be obtained by typing the following code fragment into the Forth console:

  CR .( Hello, World!)

.( (dot-paren) is an immediate word that parses a parenthesis-delimited string and displays it. As with the word ." the space character separating .( from Hello, World! is not part of the string.

The word CR comes before the text to print. By convention, the Forth interpreter does not start output on a new line. Also by convention, the interpreter waits for input at the end of the previous line, after an ok prompt. There is no implied "flush-buffer" action in Forth's CR, as there sometimes is in other programming languages.


Mixing states of compiling and interpreting

Here is the definition of a word EMIT-Q which when executed emits the single character Q:

  : EMIT-Q   81 ( the ASCII value for the character 'Q' ) EMIT ;

This definition was written to use the ASCII value of the Q character (81) directly. The text between the parentheses is a comment and is ignored by the compiler. The word EMIT takes a value from the data stack and displays the corresponding character.

The following redefinition of EMIT-Q uses the words [ (left-bracket), ] (right-bracket), CHAR and LITERAL to temporarily switch to interpreter state, calculate the ASCII value of the Q character, return to compilation state and append the calculated value to the current colon definition:

  : EMIT-Q   [ CHAR Q ]  LITERAL  EMIT ;

The parsing word CHAR takes a space-delimited word as parameter and places the value of its first character on the data stack. The word [CHAR] is an immediate version of CHAR. Using [CHAR], the example definition for EMIT-Q could be rewritten like this:

  : EMIT-Q   [CHAR] Q  EMIT ; \ Emit the single character 'Q'

This definition used \ (backslash) for the describing comment.

Both CHAR and [CHAR] are predefined in Forth. Using IMMEDIATE and POSTPONE, [CHAR] could have been defined like this:

  : [CHAR]   CHAR  POSTPONE LITERAL ; IMMEDIATE


A complete RC4 cipher program

In 1987, Ron Rivest developed the RC4 cipher-system for RSA Data Security, Inc. The following Standard Forth version uses Core and Core Extension words only.

  0 value ii        0 value jj
  0 value KeyAddr   0 value KeyLen
  create SArray   256 allot   \ state array of 256 bytes
  : KeyArray      KeyLen mod   KeyAddr ;

  : get_byte      + c@ ;
  : set_byte      + c! ;
  : as_byte       255 and ;
  : reset_ij      0 TO ii   0 TO jj ;
  : i_update      1 +   as_byte TO ii ;
  : j_update      ii SArray get_byte +   as_byte TO jj ;
  : swap_s_ij
      jj SArray get_byte
         ii SArray get_byte  jj SArray set_byte
      ii SArray set_byte
  ;

  : rc4_init ( KeyAddr KeyLen -- )
      256 min TO KeyLen   TO KeyAddr
      256 0 DO   i i SArray set_byte   LOOP
      reset_ij
      BEGIN
          ii KeyArray get_byte   jj +  j_update
          swap_s_ij
          ii 255 < WHILE
          ii i_update
      REPEAT
      reset_ij
  ;
  : rc4_byte
      ii i_update   jj j_update
      swap_s_ij
      ii SArray get_byte   jj SArray get_byte +   as_byte SArray get_byte  xor
  ;

This is one of many ways to test the code:

  hex
  create AKey   61 c, 8A c, 63 c, D2 c, FB c,
  : test   cr   0 DO  rc4_byte . LOOP  cr ;
  AKey 5 rc4_init
  2C F9 4C EE DC  5 test   \ output should be: F1 38 29 C9 DE


Implementations

Because Forth is simple to implement and has no standard reference implementation, there are numerous versions of the language. In addition to supporting the standard varieties of desktop computer systems (POSIX, Microsoft Windows, macOS), many of these Forth systems also target a variety of embedded systems. Listed here are some of the systems which conform to the 1994 Forth standard.

* ASYST, a Forth-like system for data collection and analysis
* Gforth, a portable Forth implementation from the GNU Project
* noForth, an ANS Forth implementation (as far as possible) for Flash microcontrollers (MSP430 & RISC-V)
* Open Firmware, a bootloader and firmware standard based on Forth
* pForth, portable Forth written in C
* SP-Forth, Forth implementation from the Russian Forth Interest Group (RuFIG)
* Swift Forth, machine code generating implementation from Forth, Inc.
* VFX Forth, optimizing native code Forth


See also

* RTX2010, a CPU that runs Forth natively


External links


* Programming a problem-oriented language, an unpublished book by Charles H. Moore, June 1970